-
Notifications
You must be signed in to change notification settings - Fork 7
Retention-based Partition Dropping #44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
1ec5067
to
34755cb
Compare
kruti-s
approved these changes
Sep 12, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some suggestions about potentially overlooked errors and test cases
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Determine which partitions are wholly outside of the configured
retention_period
for a given table and emit commands to drop them.I'd just like to write for posterity that this is getting real hacky, even more so now with this branch rebased on top of the last year's worth of fixes.
To give just a hint of the deltas of the current version vs. what it was designed to do:
Partman gathers the Partition Table of each table, and attaches to each Partition the beginning and end positions, and then knowledge of how to derive the beginning and end timestamps. However, we know now that because MariaDB selects Partitions based on an incoming row matching any part of a partition, the partition offsets in the partition table are just guideposts. In reality, one must
SELECT ... FROM table PARTITION(partition_name) ...
to have actual truth of what the positions of the beginning and end of the partition. Deriving it from the table is dangerous, because it works until it doesn't, and that's where #64 comes from.Partman also then assumed in the beginning that getting the start and end timestamps for a partition could come from the partition names, which leads to all these
ALTER
statements to keep the partition names roughly true. But that's hard to keep up with, and if we have a narrow set of partitions it ends up with giving very few data points from which to calculate a rate of change. But all the methods in Partman to calculate this stuff does so from the perspective of "calculate the rate of change of these partitions", but in reality we shouldn't care about the partitions at all. We should instead use the primary key, select the min, max, and then sample points between the two to get a better rate of change. Basically, we assumed in the beginning we could do this withoutearliest_utc_timestamp_query
, but now that's basically mandatory. And we should use that.And these decisions are baked into the myriad
Partition
types intypes.py
, which are all far too complicated.Anyway, this thing needs a rewrite to become much simpler.